Optimally Improving Cooperative Learning in a Social Setting

arXiv.org Artificial Intelligence

We consider a cooperative learning scenario where a collection of networked agents with individually owned classifiers dynamically update their predictions, for the same classification task, through communication or observations of each other's predictions. Clearly, if highly influential vertices use erroneous classifiers, there will be a negative effect on the accuracy of all the agents in the network. We ask the following question: how can we optimally fix the predictions of a few classifiers so as to maximize the overall accuracy in the entire network? To this end, we consider an aggregate and an egalitarian objective function. We show a polynomial-time algorithm for optimizing the aggregate objective function, and show that optimizing the egalitarian objective function is NP-hard. Furthermore, we develop approximation algorithms for the egalitarian improvement. The performance of all of our algorithms is guaranteed by mathematical analysis and backed by experiments on synthetic and real data.


Fast Projected Newton-like Method for Precision Matrix Estimation under Total Positivity

arXiv.org Artificial Intelligence

We study the problem of estimating precision matrices in Gaussian distributions that are multivariate totally positive of order two ($\mathrm{MTP}_2$). The precision matrix in such a distribution is an M-matrix. This problem can be formulated as a sign-constrained log-determinant program. Current algorithms are designed using the block coordinate descent method or the proximal point algorithm, which become computationally challenging in high-dimensional cases because they must solve numerous nonnegative quadratic programs or large-scale linear systems. To address this issue, we propose a novel algorithm based on the two-metric projection method, incorporating a carefully designed search direction and variable partitioning scheme. Our algorithm substantially reduces computational complexity, and its theoretical convergence is established. Experimental results on synthetic and real-world datasets demonstrate that our proposed algorithm provides a significant improvement in computational efficiency compared to the state-of-the-art methods.
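
For orientation, one common way to write a sign-constrained log-determinant program of this kind is sketched below. It is the standard maximum-likelihood form, not necessarily the exact objective used in the paper (which may, for example, add a sparsity penalty). Here $S$ denotes the sample covariance matrix and $\Theta$ the precision matrix; both symbols are assumptions introduced for illustration.

$$
\min_{\Theta \succ 0} \; -\log\det\Theta + \operatorname{tr}(S\Theta)
\quad \text{subject to} \quad \Theta_{ij} \le 0 \ \text{ for all } i \ne j .
$$

The constraint $\Theta_{ij} \le 0$ on the off-diagonal entries, together with positive definiteness, is what makes $\Theta$ an M-matrix.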


Learning Large-Scale MTP$_2$ Gaussian Graphical Models via Bridge-Block Decomposition

arXiv.org Artificial Intelligence

This paper studies the problem of learning large-scale Gaussian graphical models that are multivariate totally positive of order two ($\text{MTP}_2$). By introducing the concept of a bridge, which commonly exists in large-scale sparse graphs, we show that the entire problem can be equivalently optimized through (1) several smaller sub-problems induced by a \emph{bridge-block decomposition} on the thresholded sample covariance graph and (2) a set of explicit solutions on entries corresponding to bridges. From a practical perspective, this simple and provable discipline can be applied to break down a large problem into small tractable ones, leading to an enormous reduction in computational complexity and substantial improvements for all existing algorithms. The synthetic and real-world experiments demonstrate that our proposed method achieves a significant speed-up compared to the state-of-the-art benchmarks.
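
The decomposition step can be illustrated with a small Python sketch. It is a minimal illustration under stated assumptions, not the paper's implementation: the thresholding rule (keep edge $i$-$j$ when `S[i][j] > lam`) and the function names `thresholded_graph`, `find_bridges`, and `bridge_blocks` are hypothetical. A bridge is an edge whose removal disconnects its component; deleting all bridges leaves the blocks on which the smaller sub-problems would be solved.

```python
def thresholded_graph(S, lam):
    """Adjacency sets of the thresholded covariance graph.

    Assumption: an edge i-j is kept when S[i][j] > lam (illustrative rule).
    """
    n = len(S)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if S[i][j] > lam:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def find_bridges(adj):
    """Return the bridges of a simple undirected graph via low-link DFS."""
    disc, low = {}, {}
    bridges = set()
    counter = 0

    def dfs(u, parent):
        nonlocal counter
        disc[u] = low[u] = counter
        counter += 1
        for v in adj[u]:
            if v == parent:
                continue  # do not walk straight back along the tree edge
            if v in disc:
                low[u] = min(low[u], disc[v])  # back edge
            else:
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:
                    # no back edge from v's subtree reaches u or above
                    bridges.add((min(u, v), max(u, v)))

    for s in adj:
        if s not in disc:
            dfs(s, None)
    return bridges


def bridge_blocks(adj, bridges):
    """Connected components that remain after deleting every bridge."""
    seen, blocks = set(), []
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        stack, comp = [s], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v in seen or (min(u, v), max(u, v)) in bridges:
                    continue
                seen.add(v)
                stack.append(v)
        blocks.append(sorted(comp))
    return blocks


# Toy sample covariance: a triangle 0-1-2 with a path 2-3-4 attached.
S = [
    [1.0, 0.8, 0.7, 0.0, 0.0],
    [0.8, 1.0, 0.9, 0.0, 0.0],
    [0.7, 0.9, 1.0, 0.6, 0.0],
    [0.0, 0.0, 0.6, 1.0, 0.75],
    [0.0, 0.0, 0.0, 0.75, 1.0],
]
adj = thresholded_graph(S, lam=0.5)
bridges = find_bridges(adj)
blocks = bridge_blocks(adj, bridges)
print(sorted(bridges))  # [(2, 3), (3, 4)]
print(blocks)           # [[0, 1, 2], [3], [4]]
```

In this toy graph the triangle forms one block and the two path edges are bridges, so the five-node problem would split into one three-node sub-problem and two single-node blocks. The recursive DFS is adequate for small graphs; a large graph would need an iterative traversal.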


Multi-Valued Neural Networks I: A Multi-Valued Associative Memory

arXiv.org Artificial Intelligence

A new concept of a multi-valued associative memory is introduced, generalizing a similar one in fuzzy neural networks. We extend the results on fuzzy associative memory with thresholds to the multi-valued case: we introduce the novel concept of such a network without numbers, investigate its properties, and give a learning algorithm for the multi-valued case. We identify conditions under which it is possible to store given pairs of network variable patterns in such a multi-valued associative memory. In the multi-valued neural network, the variables are not numbers but elements or subsets of a lattice, i.e., they are only partially ordered. Lattice operations are used to build the network output from the inputs. In this paper, the lattice is assumed to be a Brouwer lattice, which determines the implication used, together with other lattice operations, to compute the neural network output. We give an example of using the network to classify aircraft/spacecraft trajectories.


(Almost) Envy-Free, Proportional and Efficient Allocations of an Indivisible Mixed Manna

arXiv.org Artificial Intelligence

We study the problem of finding fair and efficient allocations of a set of indivisible items to a set of agents, where each item may be a good (positively valued) for some agents and a bad (negatively valued) for others, i.e., a mixed manna. As fairness notions, we consider arguably the strongest possible relaxations of envy-freeness and proportionality, namely envy-free up to any item (EFX and EFX$_0$), and proportional up to the maximin good or any bad (PropMX and PropMX$_0$). Our efficiency notion is Pareto-optimality (PO). We study two types of instances: (i) Separable, where the item set can be partitioned into goods and bads, and (ii) Restricted mixed goods (RMG), where for each item $j$, every agent has either a non-positive value for $j$, or values $j$ at the same $v_j>0$. We obtain polynomial-time algorithms for the following:
(i) Separable instances: PropMX$_0$ allocation.
(ii) RMG instances: Let pure bads be the set of items that everyone values negatively.
- PropMX allocation for general pure bads.
- EFX+PropMX allocation for identically-ordered pure bads.
- EFX+PropMX+PO allocation for identical pure bads.
Finally, if the RMG instances are further restricted to binary mixed goods where all the $v_j$'s are the same, we strengthen the results to guarantee EFX$_0$ and PropMX$_0$ respectively.


Sparse Graph Learning Under Laplacian-Related Constraints

arXiv.org Machine Learning

We consider the problem of learning a sparse undirected graph underlying a given set of multivariate data. We focus on graph Laplacian-related constraints on the sparse precision matrix that encodes conditional dependence between the random variables associated with the graph nodes. Under these constraints the off-diagonal elements of the precision matrix are non-positive (total positivity), and the precision matrix may not be full-rank. We investigate modifications to widely used penalized log-likelihood approaches to enforce total positivity but not the Laplacian structure. The graph Laplacian can then be extracted from the off-diagonal precision matrix. An alternating direction method of multipliers (ADMM) algorithm is presented and analyzed for constrained optimization under Laplacian-related constraints and lasso as well as adaptive lasso penalties. Numerical results based on synthetic data show that the proposed constrained adaptive lasso approach significantly outperforms existing Laplacian-based approaches. We also evaluate our approach on real financial data.
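
As a rough illustration of the extraction step mentioned above, the sketch below builds a graph Laplacian from the off-diagonal entries of a precision matrix. It assumes the usual convention that edge weights are the negated off-diagonal entries, $w_{ij} = -\Theta_{ij}$, and that the Laplacian diagonal is the weighted degree; the paper's exact procedure may differ, and the function name `laplacian_from_precision` is hypothetical.

```python
def laplacian_from_precision(theta):
    """Build a graph Laplacian from the off-diagonal part of `theta`.

    Assumptions (illustrative, not taken from the paper):
      - edge weight w_ij = -theta[i][j] for i != j, which must be >= 0
      - L[i][j] = -w_ij off the diagonal, L[i][i] = sum_j w_ij
    The diagonal of `theta` is ignored.
    """
    n = len(theta)
    L = [[0] * n for _ in range(n)]
    for i in range(n):
        degree = 0
        for j in range(n):
            if i == j:
                continue
            if theta[i][j] > 0:
                raise ValueError("off-diagonal entries must be non-positive")
            L[i][j] = theta[i][j]   # equals -w_ij
            degree += -theta[i][j]  # accumulate w_ij
        L[i][i] = degree
    return L


# A 3-node path graph encoded in the off-diagonals of a precision matrix.
theta = [
    [2, -1, 0],
    [-1, 3, -1],
    [0, -1, 2],
]
L = laplacian_from_precision(theta)
print(L)  # [[1, -1, 0], [-1, 2, -1], [0, -1, 1]]
```

Every row of the result sums to zero, so the Laplacian is rank-deficient even when the input precision matrix is full-rank.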


Bayesian Inference of Random Dot Product Graphs via Conic Programming

arXiv.org Machine Learning

We present a convex cone program to infer the latent probability matrix of a random dot product graph (RDPG). The optimization problem maximizes the Bernoulli maximum likelihood function with an added nuclear norm regularization term. The dual problem has a particularly nice form, related to the well-known semidefinite program relaxation of the maxcut problem. Using the primal-dual optimality conditions, we bound the entries and rank of the primal and dual solutions. Furthermore, we bound the optimal objective value and prove asymptotic consistency of the probability estimates of a slightly modified model under mild technical assumptions. Our experiments on synthetic RDPGs not only recover natural clusters, but also reveal the underlying low-dimensional geometry of the original data. We also demonstrate that the method recovers latent structure in the Karate Club Graph and synthetic U.S. Senate vote graphs and is scalable to graphs with up to a few hundred nodes.
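
To make the object being inferred concrete, the sketch below samples a graph from the RDPG generative model: each node $i$ has a latent vector $x_i$, the edge probability is the dot product $P_{ij} = x_i^\top x_j$, and each edge is an independent Bernoulli draw. This shows only the forward model, not the paper's conic program; the function names and the toy latent positions are assumptions for illustration.

```python
import random


def rdpg_probability_matrix(X):
    """Latent probability matrix P with P[i][j] = <x_i, x_j>."""
    n = len(X)
    return [
        [sum(a * b for a, b in zip(X[i], X[j])) for j in range(n)]
        for i in range(n)
    ]


def sample_rdpg(X, seed=0):
    """Sample a symmetric, loop-free adjacency matrix from latent positions X."""
    rng = random.Random(seed)
    P = rdpg_probability_matrix(X)
    n = len(X)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = P[i][j]
            # small tolerance guards against floating-point rounding
            if not -1e-12 <= p <= 1 + 1e-12:
                raise ValueError("latent positions give a probability outside [0, 1]")
            if rng.random() < p:
                A[i][j] = A[j][i] = 1
    return A


# Two loose clusters in a 2-dimensional latent space (toy values).
X = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
A = sample_rdpg(X, seed=42)
for row in A:
    print(row)
```

Nodes whose latent vectors point in similar directions connect with high probability, which is why clusters in the sampled graph reflect the low-dimensional geometry of the latent positions.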


Deconstructing word embedding algorithms

arXiv.org Artificial Intelligence

The advent of efficient uncontextualized word embedding algorithms (e.g., Word2vec (Mikolov et al., 2013) and GloVe (Pennington et al., 2014)) marked a historical breakthrough in NLP. Countless researchers employed word embeddings in new models to improve results on a multitude of NLP problems. In this work, we provide a retrospective analysis of these groundbreaking models of the past, which simultaneously offers theoretical insights for how future models can be developed and understood.

In general topology, an embedding is understood as an injective structure-preserving map, $f: X \to Y$, between two mathematical structures $X$ and $Y$. A word embedding algorithm ($f$) learns an inner-product space ($Y$) to preserve a linguistic structure within a reference corpus of text, $D$ ($X$), based on a vocabulary, $V$. The structure in $D$ is analyzed in terms of the relationships between words induced by their co-appearances, according to a certain definition of context. In such an analysis, each word figures dually: (1) as a focal element